[Spark] Spark Problems Encountered

Posted by 李玉坤 on 2018-07-20

1. ERROR cluster.YarnClientSchedulerBackend: The YARN application has already ended! It might have been killed or the Application Master may have failed to start. Check the YARN application logs for more details.

The likely cause is insufficient memory on the NodeManagers. Two settings control YARN's container memory checks:
yarn.nodemanager.pmem-check-enabled
Whether to check the amount of physical memory each container is using and kill the container when it exceeds its allocation; defaults to true.
yarn.nodemanager.vmem-check-enabled
Whether to check the amount of virtual memory each container is using and kill the container when it exceeds its allocation; defaults to true.

Add the following to yarn-site.xml (the NodeManagers need a restart for the change to take effect):

<property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
</property>
<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>
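
Note that disabling these checks hides the symptom rather than fixing it. A gentler alternative, sketched below with illustrative values (the 2g executor memory and 1024 MB overhead are assumptions, not recommendations), is to keep the checks on and give each executor more off-heap headroom:

# Keep YARN's memory checks enabled and raise the per-executor overhead instead.
# spark.yarn.executor.memoryOverhead is the Spark 2.x property name (value in MB);
# Spark 2.3+ also accepts spark.executor.memoryOverhead. Values here are examples only.
spark-shell --master yarn --deploy-mode client \
  --executor-memory 2g \
  --conf spark.yarn.executor.memoryOverhead=1024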

2. On YARN, spark-shell cannot be launched in cluster deploy mode

[bigdate@hadoop conf]$ spark-shell --master yarn --deploy-mode cluster
Exception in thread "main" org.apache.spark.SparkException: Cluster deploy mode is not applicable to Spark shells.

An interactive shell runs its driver on the submitting machine so it can read your input, which is why only client mode is supported. The correct way to start it:
[bigdate@hadoop conf]$ spark-shell --master yarn --deploy-mode client
20/07/06 21:45:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
20/07/06 21:45:37 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
Spark context Web UI available at http://hadoop:4040
Spark context available as 'sc' (master = yarn, app id = application_1594043119043_0001).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.4
      /_/

Using Scala version 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_191)
Type in expressions to have them evaluated.
Type :help for more information.

scala>
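
Cluster mode itself works on YARN for a packaged application; only interactive shells are rejected. As a quick check, the bundled SparkPi example can be submitted in cluster mode (a sketch; the jar path assumes the stock Spark 2.4.4 layout and should be adjusted to your install):

# Cluster mode is fine for spark-submit with an application jar;
# only spark-shell/pyspark are restricted to client mode.
# The jar path assumes a standard Spark 2.4.4 distribution under $SPARK_HOME.
spark-submit --master yarn --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  $SPARK_HOME/examples/jars/spark-examples_2.11-2.4.4.jar 100

In cluster mode the driver runs inside the YARN ApplicationMaster, so the job's output appears in the YARN application logs rather than on the local console.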